Classifying Papers from Different Computer Science Conferences
نویسندگان
چکیده
This paper analyzes what stylistic characteristics differentiate different styles of writing, and specifically types of different A-level computer science articles. To do so, we compared various full papers using stylistic feature sets and a supervised machine learning method. We report on the success of this approach in identifying papers from the last 6 years of the following three conferences: SIGIR, ACL, and AAMAS. This approach achieves high accuracy results of 95.86%, 97.04%, 93.22%, and 92.14% for the following four classification experiments: (1) SIGIR / ACL, (2) SIGIR / AAMAS, (3) ACL / AAMAS, and (4) SIGIR / ACL / AAMAS, respectively. The Part of Speech (PoS) and the Orthographic sets were superior to all others and have been found as key components in different types of writing.
منابع مشابه
Peer-Selected “Best Papers”—Are They Really That “Good”?
BACKGROUND Peer evaluation is the cornerstone of science evaluation. In this paper, we analyze whether or not a form of peer evaluation, the pre-publication selection of the best papers in Computer Science (CS) conferences, is better than random, when considering future citations received by the papers. METHODS Considering 12 conferences (for several years), we collected the citation counts f...
متن کاملچهار دهه فعالیت علمی ایران از منظر مقالات همایشها، مقالات پر استناد و داغ و مقالات دسترسی آزاد با نگاهی به قانون برنامه توسعه اقتصادی ، اجتماعی، فرهنگی کشور
This study aims to investigate Iran scientific production Pre-revolutionary by 2016 with the emphasis on the conferences proceedings, highly cited and hot papers, and open access papers, in the light of the Law of Economic, Social, and Cultural Development Plan of Iran. Descriptive – analytical method used. To achieve research objectives data extracted from Clarivate Analytics (Thomson Reuters)...
متن کاملAlgorithmic Detection of Computer Generated Text
Computer generated academic papers have been used to expose a lack of thorough human review at several computer science conferences. We assess the problem of classifying such documents. After identifying and evaluating several quantifiable features of academic papers, we apply methods from machine learning to build a binary classifier. In tests with two hundred papers, the resulting classifier ...
متن کاملThe skewness of computer science
Computer science is a relatively young discipline combining science, engineering, and mathematics. The main flavors of computer science research involve the theoretical development of conceptual models for the different aspects of computing and the more applicative building of software artifacts and assessment of their properties. In the computer science publication culture, conferences are an ...
متن کاملMarkov Topic Models
We develop Markov topic models (MTMs), a novel family of generative probabilistic models that can learn topics simultaneously from multiple corpora, such as papers from different conferences. We apply Gaussian (Markov) random fields to model the correlations of different corpora. MTMs capture both the internal topic structure within each corpus and the relationships between topics across the co...
متن کامل